"The highlighted tokens are primarily single or multi-character morphemes in Thai, Bulgarian, and related scripts, often marking key semantic units such as roots, affixes, or important syllables within words. These tokens frequently appear in positions of morphological or syntactic significance, such as forming the core meaning of a word, indicating grammatical relationships, or serving as part of compound or derived forms. The activations suggest a focus on linguistically meaningful subword units across multiple languages."
Score Type | Accuracy | Precision | Recall | F1 score | TPR | TNR | FPR | FNR |
---|---|---|---|---|---|---|---|---|
detection | 0.77 | 0.814 | 0.7 | 0.753 | 0.7 | 0.84 | 0.16 | 0.3 |
fuzz | 0.66 | 0.6 | 0.96 | 0.738 | 0.96 | 0.36 | 0.64 | 0.04 |
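
The columns in the table above are not independent: every value in a row can be recomputed from that row's TPR and TNR alone, if one assumes the evaluation set held equal numbers of positive and negative examples. That balance is an assumption, not something the table states, although both rows are consistent with it. The Python sketch below shows the arithmetic; the function name `metrics_from_rates` is hypothetical and not part of any scoring library.

```python
def metrics_from_rates(tpr: float, tnr: float) -> dict:
    """Derive the table's columns from TPR and TNR.

    Assumption (not stated in the source): positives and negatives
    are equally represented, so accuracy is the mean of TPR and TNR.
    """
    fpr = 1.0 - tnr  # false positive rate
    fnr = 1.0 - tpr  # false negative rate
    accuracy = (tpr + tnr) / 2.0
    # With balanced classes, precision = TP / (TP + FP) reduces to TPR / (TPR + FPR).
    precision = tpr / (tpr + fpr) if (tpr + fpr) > 0 else 0.0
    recall = tpr
    denom = precision + recall
    f1 = 2.0 * precision * recall / denom if denom > 0 else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "tpr": tpr,
        "tnr": tnr,
        "fpr": fpr,
        "fnr": fnr,
    }


# TPR and TNR copied from the two rows of the table.
detection = metrics_from_rates(tpr=0.70, tnr=0.84)
fuzz = metrics_from_rates(tpr=0.96, tnr=0.36)

for name, m in (("detection", detection), ("fuzz", fuzz)):
    print(name, {k: round(v, 3) for k, v in m.items()})
```

Read this way, the two rows describe different failure modes. The detection score is conservative: it misses 30% of activating examples but rarely flags non-activating ones. The fuzz score accepts almost everything, so its recall of 0.96 comes with a false positive rate of 0.64, and its precision of 0.6 is only slightly better than the 0.5 that labeling every example positive would give on a balanced set.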